Turning a Monolingual Speaker into Multilingual for a Mixed-language TTS

Authors

  • Ji He
  • Yao Qian
  • Frank K. Soong
  • Sheng Zhao
Abstract

We propose a new approach to rendering speech in different languages from only a speaker’s monolingual recordings, for mixed-code TTS applications. A reference speaker in the target language (say, Chinese) is used to help build the target-language TTS from “tiles” of the original source speaker’s monolingual (say, English) data. The difference between the monolingual source speaker and the reference speaker is first equalized by warping spectral frequency, adapting F0 dynamics, and adjusting speaking rate. The equalized Chinese sentences of the reference speaker are then tiled, frame by frame, with the best tiles from the monolingual English speaker’s segments. In addition to a standard English TTS trained on the speaker’s monolingual English recordings, a Chinese TTS of the same speaker is trained on the newly tiled sentences. The resulting mixed-language (English-Chinese) TTS synthesizes high-quality mixed-language speech in one consistent voice, as confirmed by both objective and subjective evaluations.
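
The equalize-then-tile idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual algorithms, so everything here is an assumption made for illustration: F0 adaptation is shown as a mean and variance match in the log-F0 domain, and tiling as a plain nearest-neighbour search over feature frames. The function names `adapt_log_f0` and `tile_frames` are hypothetical, and frequency warping and speaking-rate adjustment are left out.

```python
import math
from statistics import mean, pstdev


def adapt_log_f0(ref_f0_hz, src_f0_hz):
    """Map the reference speaker's F0 contour onto the source speaker's
    log-F0 mean and standard deviation (assumed adaptation rule, not
    necessarily the one used in the paper). F0 == 0 marks unvoiced frames."""
    ref_log = [math.log(f) for f in ref_f0_hz if f > 0]
    src_log = [math.log(f) for f in src_f0_hz if f > 0]
    ref_mu, ref_sd = mean(ref_log), pstdev(ref_log)
    src_mu, src_sd = mean(src_log), pstdev(src_log)
    scale = src_sd / ref_sd if ref_sd > 0 else 1.0
    # Unvoiced frames are passed through unchanged.
    return [
        math.exp((math.log(f) - ref_mu) * scale + src_mu) if f > 0 else 0.0
        for f in ref_f0_hz
    ]


def tile_frames(equalized_ref_frames, src_frames):
    """For each equalized reference frame, return the index of the closest
    source-speaker frame by squared Euclidean distance (assumed cost)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(src_frames)), key=lambda j: dist(frame, src_frames[j]))
        for frame in equalized_ref_frames
    ]


# Toy data: a higher-pitched reference speaker, a lower-pitched source speaker.
adapted = adapt_log_f0([200.0, 0.0, 220.0, 240.0], [100.0, 110.0, 120.0])
tiles = tile_frames([[0.0, 0.0], [1.0, 1.0]],
                    [[0.9, 1.1], [0.1, -0.1], [5.0, 5.0]])
```

In this toy setup the selected indices in `tiles` point into the source speaker's own frames, so a voice built from the tiled sentences would keep that speaker's timbre. A practical system would likely add a continuity cost between neighbouring tiles rather than pick each frame independently.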

Similar resources

On Building Mixed Lingual Speech Synthesis Systems

Code-mixing, a phenomenon in which lexical items from one language are embedded in an utterance of another, is relatively frequent in multilingual communities. However, TTS systems today are not fully capable of effectively handling such mixed content despite achieving high quality in the monolingual case. In this paper, we investigate various mechanisms for building mixed lingual systems which are bui...

Constructing a Reusable Linguistic Resource for a Polyglot Speech Synthesis

This paper is about constructing sharable linguistic information to be used across languages for a Text-to-Speech (TTS) system. The data is obtained from existing resources. The focus of the paper is the phonetic and linguistic aspects. A monolingual TTS architecture is introduced with descriptions of each stage of processing. A multilingual TTS architecture is also introduced. Language depende...

Improvements in Non-Verbal Cue Identification Using Multilingual Phone Strings

Today’s state-of-the-art front-ends for multilingual speech-to-speech translation systems apply monolingual speech recognizers trained for a single language and/or accent. The monolingual speech engine is usually adaptable to an unknown speaker over time using unsupervised training methods; however, if the speaker was seen during training, their specialized acoustic model will be applied, since ...

From multilingual to polyglot speech synthesis

This paper proposes a distinction between existing multilingual synthesis systems and mixed-lingual or polyglot synthesis systems. The latter should be capable of synthesising with the same voice utterances which contain foreign language words or word groups. As a first step towards polyglot synthetic speech, the design and realisation of a 4-lingual single-speaker diphone inventory is detailed...

A general approach to TTS reading of mixed-language texts

The paper presents the Loquendo TTS approach to mixed-language speech synthesis, offering a range of options to handle the various situations where texts may be in different languages or embed foreign phrases. The most challenging target is to make a monolingual TTS voice read a foreign-language text. The adopted Foreign Pronunciation Strategy discussed here allows mixing phonetic transcrip...

Publication date: 2012